152 research outputs found

    Character-level and syntax-level models for low-resource and multilingual natural language processing

    There are more than 7000 languages in the world, but only a small portion of them benefit from Natural Language Processing resources and models. Although languages generally present different characteristics, “cross-lingual bridges” can be exploited, such as transliteration signals and word alignment links. Such information, together with the availability of multiparallel corpora and the urge to overcome language barriers, motivates us to build models that represent more of the world’s languages. This thesis investigates cross-lingual links for improving the processing of low-resource languages with language-agnostic models at the character and syntax level. Specifically, we propose to (i) use orthographic similarities and transliteration between Named Entities and rare words in different languages to improve the construction of Bilingual Word Embeddings (BWEs) and named entity resources, and (ii) exploit multiparallel corpora for projecting labels from high- to low-resource languages, thereby gaining access to weakly supervised processing methods for the latter. In the first publication, we describe our approach for improving the translation of rare words and named entities for the Bilingual Dictionary Induction (BDI) task, using orthography and transliteration information. In our second work, we tackle BDI by enriching BWEs with orthography embeddings and a number of other features, using our classification-based system to overcome script differences among languages. The third publication describes cheap cross-lingual signals that should be considered when building mapping approaches for BWEs since they are simple to extract, effective for bootstrapping the mapping of BWEs, and overcome the failure of unsupervised methods. The fourth paper shows our approach for extracting a named entity resource for 1340 languages, including very low-resource languages from all major areas of linguistic diversity. 
We exploit parallel corpus statistics and transliteration models and obtain improved performance over prior work. Lastly, the fifth work models annotation projection as a graph-based label propagation problem for the part-of-speech tagging task. Part-of-speech models trained on our labeled sets outperform prior work for low-resource languages like Bambara (an African language spoken in Mali), Erzya (a Uralic language spoken in Russia’s Republic of Mordovia), Manx (the Celtic language of the Isle of Man), and Yoruba (a Niger-Congo language spoken in Nigeria and surrounding countries).
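The abstract does not describe the propagation algorithm itself, so the following is only a minimal sketch of generic graph-based label propagation, not the thesis's method. It assumes a symmetric adjacency matrix (for example, built from word alignment links) and a handful of seed nodes whose tags are already known; the function name `propagate_labels` and its parameters are hypothetical.

```python
import numpy as np


def propagate_labels(adjacency, seed_labels, n_iters=50):
    """Spread seed labels over a graph by repeated neighbour averaging.

    adjacency   -- square array of non-negative edge weights (assumed symmetric)
    seed_labels -- dict mapping node index -> known class index
    Returns the predicted class index for every node.
    """
    adjacency = np.asarray(adjacency, dtype=float)
    n_nodes = adjacency.shape[0]
    n_classes = max(seed_labels.values()) + 1

    # Row-normalise so that each node averages over its neighbours.
    row_sums = adjacency.sum(axis=1, keepdims=True)
    transition = adjacency / np.where(row_sums == 0, 1.0, row_sums)

    # Start with one-hot scores on the seeds and zeros elsewhere.
    scores = np.zeros((n_nodes, n_classes))
    for node, label in seed_labels.items():
        scores[node, label] = 1.0

    for _ in range(n_iters):
        scores = transition @ scores
        # Clamp the seeds back to their known labels after every step.
        for node, label in seed_labels.items():
            scores[node] = 0.0
            scores[node, label] = 1.0

    return scores.argmax(axis=1)
```

In a projection setting, the seeds would typically be words whose tags come from a high-resource language, and the unlabeled nodes would be words in the low-resource language.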

    SilverAlign: MT-Based Silver Data Algorithm For Evaluating Word Alignment

    Word alignments are essential for a variety of NLP tasks. Therefore, choosing the best approaches for their creation is crucial. However, the scarce availability of gold evaluation data makes the choice difficult. We propose SilverAlign, a new method to automatically create silver data for the evaluation of word aligners by exploiting machine translation and minimal pairs. We show that performance on our silver data correlates well with gold benchmarks for 9 language pairs, making our approach a valid resource for evaluation across different domains and languages when gold data are not available. This addresses the important scenario of missing gold alignment data for low-resource languages.
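The abstract does not say which scores are used to compare aligners. As an illustration only, the sketch below computes the measures commonly reported for word alignment (precision, recall, F1 and alignment error rate) against a reference set, which could be gold or silver data. The function name `alignment_scores` and the representation of alignments as sets of (source index, target index) pairs are assumptions.

```python
def alignment_scores(predicted, sure, possible=None):
    """Score predicted alignment links against reference links.

    predicted -- iterable of (source_index, target_index) pairs from an aligner
    sure      -- reference links marked as certain
    possible  -- optional reference links marked as merely possible;
                 the sure links are always counted as possible too
    """
    predicted = set(predicted)
    sure = set(sure)
    possible = sure if possible is None else set(possible) | sure

    hits_possible = len(predicted & possible)
    hits_sure = len(predicted & sure)

    precision = hits_possible / len(predicted) if predicted else 0.0
    recall = hits_sure / len(sure) if sure else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0

    # Alignment error rate: 1 - (|A & S| + |A & P|) / (|A| + |S|)
    denom = len(predicted) + len(sure)
    aer = 1.0 - (hits_sure + hits_possible) / denom if denom else 0.0

    return {"precision": precision, "recall": recall, "f1": f1, "aer": aer}
```

Scoring several aligners this way on both silver and gold references would give the paired values needed to check how well the two rankings agree.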

    Biologically Inspired Modelling for the Control of Upper Limb Movements: From Concept Studies to Future Applications

    Modelling is continuously being deployed to gain knowledge on the mechanisms of motor control. Computational models, simulating the behaviour of complex systems, have often been used in combination with soft computing strategies, thus shifting the rationale of modelling from the description of a behaviour to the understanding of the mechanisms behind it. In this context, computational models are preferred to deterministic schemes because they deal better with complex systems. The literature offers some striking examples of biologically inspired modelling, which performs better than traditional approaches when dealing with both learning and adaptivity mechanisms. Can these theoretical studies be transferred into an application framework? That is, can biologically inspired models be used to implement rehabilitative devices? Some evidence, even if preliminary, is presented here and supports an affirmative answer to this question, thus opening new perspectives.

    Prescribing of psychotropic medications to the elderly population of a Canadian province: a retrospective study using administrative databases

    Background. Psychotropic medications, in particular second-generation antipsychotics (SGAs) and benzodiazepines, have been associated with harm in elderly populations. Health agencies around the world have issued warnings about the risks of prescribing such medications to frail individuals affected by dementia and current guidelines recommend their use only in cases where the benefits clearly outweigh the risks. This study documents the use of psychotropic medications in the entire elderly population of a Canadian province in the context of current clinical guidelines for the treatment of behavioural disturbances. Methods. Prevalent and incident utilization of antipsychotics, benzodiazepines and related medications (zopiclone and zaleplon) were determined in the population of Manitobans over age 65 in the time period 1997/98 to 2008/09 fiscal years. Comparisons between patients living in the community and those living in personal care (nursing) homes (PCH) were conducted. Influence of sociodemographic characteristics on prescribing was assessed by generalized estimating equations. Non-optimal use was defined as the prescribing of high doses of antipsychotic medications and the use of combination therapy of a benzodiazepine (or zopiclone/zaleplon) with an antipsychotic. A decrease in intensity of use over time and lower proportions of patients treated with antipsychotics at high dose or in combination with benzodiazepines (or zopiclone/zaleplon) were considered a trend toward better prescribing. Multiple regression analysis determined predictors of non-optimal use in the elderly population. Results. A 20-fold greater prevalent utilization of SGAs was observed in PCH-dwelling elderly persons compared to those living in the community. In 2008/09, 27% of PCH-dwelling individuals received a prescription for an SGA. 
Patient characteristics, such as younger age, male gender, diagnoses of dementia (or use of an acetylcholinesterase inhibitor) or psychosis in the year prior to the prescription, were predictors of non-optimal prescribing (e.g., high dose antipsychotics). Between 2002/3 and 2007/8, amongst new users of SGAs, 10.2% received high doses. Those receiving high dose antipsychotics did not show high levels of polypharmacy. Conclusions. Despite encouraging trends, the use of psychotropic medications remains high in elderly individuals, especially in residents of nursing homes. Clinicians caring for such patients need to carefully assess risks and benefits.

    A Smo/Gli multitarget hedgehog pathway inhibitor impairs tumor growth

    Pharmacological Hedgehog (Hh) pathway inhibition has emerged as a valuable anticancer strategy. A number of small molecules able to block the pathway at the upstream receptor Smoothened (Smo) or the downstream effector glioma-associated oncogene 1 (Gli1) have been designed and developed. In a recent study, we exploited the high versatility of the natural isoflavone scaffold for targeting the Hh signaling pathway at multiple levels, showing that the simultaneous targeting of Smo and Gli1 provided synergistic Hh pathway inhibition stronger than single administration. This approach seems to effectively overcome drug resistance, particularly at the level of Smo. Here, we combined the pharmacophores targeting Smo and Gli1 into a single isoflavone, compound 22, which inhibits the Hh pathway at both the upstream and downstream levels. We demonstrate that this multitarget agent suppresses medulloblastoma growth in vitro and in vivo through antagonism of Smo and Gli1, which is a novel mechanism of action in Hh inhibition.

    Improved testing inference in mixed linear models

    Mixed linear models are commonly used in repeated measures studies. They account for the dependence amongst observations obtained from the same experimental unit. Oftentimes, the number of observations is small, and it is thus important to use inference strategies that incorporate small sample corrections. In this paper, we develop modified versions of the likelihood ratio test for fixed effects inference in mixed linear models. In particular, we derive a Bartlett correction to such a test and also to a test obtained from a modified profile likelihood function. Our results generalize those in Zucker et al. (Journal of the Royal Statistical Society B, 2000, 62, 827-838) by allowing the parameter of interest to be vector-valued. Additionally, our Bartlett corrections allow for a nonlinear covariance matrix structure for the random effects. We report numerical evidence which shows that the proposed tests display superior finite sample behavior relative to the standard likelihood ratio test. An application is also presented and discussed. Comment: 17 pages, 1 figure.
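For orientation, the general form of a Bartlett correction can be written as follows. This is the textbook construction, not the specific correction factor derived in the paper; the symbols $LR$, $q$ and $B$ are notation assumed here, with $q$ the dimension of the parameter of interest and $B$ a term of order $n^{-1}$.

```latex
% Under the null hypothesis the likelihood ratio statistic satisfies
\mathrm{E}[LR] = q + B + O(n^{-2}), \qquad B = O(n^{-1}).

% Rescaling by the estimated mean gives the Bartlett-corrected statistic
LR^{*} = \frac{LR}{1 + B/q},

% whose null distribution is \chi^2_q with error of order n^{-2},
% compared with order n^{-1} for the uncorrected statistic LR.
```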

    An Objective, Information-Based Approach for Selecting the Number of Muscle Synergies to be Extracted via Non-Negative Matrix Factorization

    Muscle synergy analysis (MSA) is a useful tool for the evaluation of motor control strategies and for the quantification of motor performance. Among the parameters that can be extracted, most of the information is included in the rank of the modular control model (i.e. the number of muscle synergies that can be used to describe the overall muscle coordination). Even though different criteria have been proposed in the literature, an objective criterion for the model order selection is needed to improve reliability and repeatability of MSA results. In this paper, we propose an Akaike Information Criterion (AIC)-based method for model order selection when extracting muscle synergies via the original Gaussian Non-Negative Matrix Factorization algorithm. The traditional AIC definition has been modified based on a correction of the likelihood term, which includes signal dependent noise on the neural commands, and a Discrete Wavelet decomposition method for the proper estimation of the number of degrees of freedom of the model, reduced on a synergy-by-synergy and event-by-event basis. We tested the performance of our method in comparison with the most widespread ones, showing that our criterion is able to yield good and stable performance in selecting the correct model order in simulated EMG data. We further evaluated the performance of our AIC-based technique on two distinct experimental datasets, confirming the results obtained with the synthetic signals, with performances that are stable and independent of the nature of the analysed task, of the signal quality and of the subjective EMG pre-processing steps.
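To make the idea of AIC-based model order selection concrete, here is a minimal sketch that uses the unmodified Gaussian AIC with a naive parameter count. It does not reproduce the paper's corrected likelihood term or its wavelet-based estimate of the degrees of freedom. The multiplicative-update factorization and all function names (`nmf`, `gaussian_aic`, `select_rank`) are assumptions made for illustration.

```python
import numpy as np


def nmf(data, rank, n_iters=500, seed=0):
    """Factorize a non-negative matrix as W @ H with multiplicative updates
    that minimise the squared (Gaussian) reconstruction error."""
    rng = np.random.default_rng(seed)
    n_rows, n_cols = data.shape
    w = rng.random((n_rows, rank)) + 1e-3
    h = rng.random((rank, n_cols)) + 1e-3
    eps = 1e-12  # guards against division by zero
    for _ in range(n_iters):
        h *= (w.T @ data) / (w.T @ w @ h + eps)
        w *= (data @ h.T) / (w @ h @ h.T + eps)
    return w, h


def gaussian_aic(data, w, h):
    """Plain AIC for i.i.d. Gaussian residuals: n * log(RSS / n) + 2 * k.

    Assumption: k simply counts every entry of W and H, which is NOT the
    reduced degrees-of-freedom estimate described in the abstract.
    """
    n_obs = data.size
    rss = float(np.sum((data - w @ h) ** 2))
    n_params = w.size + h.size
    return n_obs * np.log(max(rss, 1e-12) / n_obs) + 2 * n_params


def select_rank(data, candidate_ranks):
    """Return the candidate rank with the lowest AIC, plus all AIC values."""
    aic = {rank: gaussian_aic(data, *nmf(data, rank)) for rank in candidate_ranks}
    return min(aic, key=aic.get), aic
```

With an EMG envelope matrix of shape (muscles, samples), a call such as `select_rank(envelopes, range(1, 7))` would return the preferred number of synergies under this simplified criterion, where `envelopes` is a placeholder for the user's own data.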